Slide 2 - Why did we use R?
Topic:
- A bone marrow transplant replaces damaged blood cells with healthy stem cells
- The success of the transplantation can depend on many different biological factors
- These factors influence the recovery of the immune system after the transplantation
Methodology:
The R programming language is a powerful tool that allowed us to clean, visualize, organize, and manipulate the data. This let us examine how the different variables correlate with the survival rate after the transplant.
Data Wrangling
Dataset: Bone marrow transplant: Children (Donated 20/04/2020)
File: bone-marrow.arff
An Attribute-Relation File Format (ARFF) file, .arff
Content: two distinct sections: Header & Data
Comments are marked with “%”
The header contains the metadata:
- Name of the relation
- List of attributes
Steps:
- Download file into newly created data folders
- Extract metadata & data into two compressed files: metadata.txt.gz (plain text) & data.tsv.gz (tab-separated tidy data)
- Augment columns & binary values
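The extraction step above can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the project, and it rests on assumptions: the sample text, the function names `split_arff`, `attribute_names` and `to_tsv`, and the choice to write ARFF's `?` missing-value marker as `NA` are all hypothetical. The naive comma split also ignores quoted values.

```python
# Toy ARFF text (hypothetical, NOT the real bone-marrow file).
SAMPLE = """% Toy example for illustration only
@relation toy
@attribute age numeric
@attribute status {0,1}
@data
9.6,0
4.0,1
?,1
"""


def split_arff(text):
    """Split ARFF text into header lines (metadata) and data rows."""
    header, rows = [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_data:
            if line.startswith("%"):
                continue  # skip comments inside the data section
            rows.append(line.split(","))  # simplification: no quoted values
        elif line.lower().startswith("@data"):
            in_data = True  # everything after @data is a data row
        else:
            header.append(line)  # comments, @relation and @attribute lines
    return header, rows


def attribute_names(header):
    """Return the attribute names declared in the header, in order."""
    return [
        line.split()[1]
        for line in header
        if line.lower().startswith("@attribute")
    ]


def to_tsv(names, rows):
    """Render rows as tab-separated text; '?' becomes 'NA' (an assumption)."""
    lines = ["\t".join(names)]
    for row in rows:
        lines.append("\t".join("NA" if v.strip() == "?" else v.strip() for v in row))
    return "\n".join(lines) + "\n"


header, rows = split_arff(SAMPLE)
names = attribute_names(header)
tsv = to_tsv(names, rows)
print(tsv)
```

To write the two results as compressed files, Python's standard `gzip.open(path, "wt")` returns a text-mode handle that could receive the header lines and the TSV text.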
Description of the data
Table 1. Information on the subjects included in the data.
| | ALL | AML | Chronic | Lymphoma | Non-malignant |
|---|---|---|---|---|---|
| N | 67 | 33 | 45 | 9 | 32 |
| Male patients | 41 (61.2 %) | 17 (51.5 %) | 25 (55.6 %) | 7 (77.8 %) | 21 (65.6 %) |
| Median recipient age (years) | 8 | 14 | 11 | 13 | 8 |
| Relapse | 12 (17.9 %) | 5 (15.2 %) | 6 (13.3 %) | 4 (44.4 %) | 1 (3.1 %) |
| Deceased patients | 30 (44.8 %) | 15 (45.5 %) | 19 (42.2 %) | 9 (100 %) | 12 (37.5 %) |
| Median follow-up time for surviving patients (days) | 1301 | 1561 | 1867 | NA | 1327 |
| Median survival time for deceased patients (days) | 168 | 274 | 130 | 67 | 130 |
ALL, AML, chronic and non-malignant: large patient cohort
Lymphoma patients: small patient cohort (careful interpretation needed)
In every disease group, a majority of the patients were male
Median recipient age did not vary greatly among diseases, but included both pre-adolescent and adolescent ages
A minority (<18 %) experienced relapse, except in the lymphoma group (almost half)
A majority survived, except for lymphoma patients
Median follow-up time for surviving patients was around 3.5 to 5 years depending on the disease
Median survival time for deceased patients varied from around 2 (lymphoma) to around 9 (AML) months
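The percentages in Table 1 and the day-based medians in the notes above can be rechecked from the raw counts. This is a minimal sketch in Python, not the project's R code. The counts are copied from Table 1 in column order, while the helper `percent` and the average lengths of 365.25 days per year and 30.44 days per month are assumptions made for the illustration.

```python
# Counts copied from Table 1, one entry per disease column (left to right).
n_patients = [67, 33, 45, 9, 32]
male = [41, 17, 25, 7, 21]
relapse = [12, 5, 6, 4, 1]
deceased = [30, 15, 19, 9, 12]

DAYS_PER_YEAR = 365.25   # assumption: average year length
DAYS_PER_MONTH = 30.44   # assumption: average month length


def percent(part, total):
    """Share of `part` in `total`, as a percentage with one decimal."""
    return round(100 * part / total, 1)


male_pct = [percent(m, n) for m, n in zip(male, n_patients)]
relapse_pct = [percent(r, n) for r, n in zip(relapse, n_patients)]
deceased_pct = [percent(d, n) for d, n in zip(deceased, n_patients)]

# Median follow-up of survivors (days) converted to years; the NA column is skipped.
follow_up_years = [round(d / DAYS_PER_YEAR, 1) for d in (1301, 1561, 1867, 1327)]

# Shortest and longest median survival of deceased patients, converted to months.
survival_months = [round(d / DAYS_PER_MONTH, 1) for d in (67, 274)]

print(male_pct)
print(relapse_pct)
print(deceased_pct)
print(follow_up_years)
print(survival_months)
```

Under these assumptions the 67-day and 274-day medians come to roughly 2 and 9 months.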
PCA
Include: (Exploratory Analysis, PCA)
Slide 6
Include:
Recipientage, Disease type, Rbodymass index vs. survival rate
The boxplot compares the age distribution of the recipients across the different disease types, split by survival status: alive or dead.
Relevant outcomes:
Patients with lymphoma had a survival rate of 0%.
Age might be associated with mortality, since older patients had higher mortality than younger ones.
Lymphoma was the only disease type in which mortality exceeded survival (Table 1).
The bar plot shows the relationship between BMI and survival rate: survival decreases as BMI increases. Underweight patients had the highest number of survivors, while the obese group was the only category in which mortality exceeded survival.
Neutrophil and platelet recovery time
- No clear connection between time for neutrophil or platelet recovery and survival time
- Interpret with caution: there are few data points, and outliers may have a strong influence
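The caution about outliers can be illustrated by comparing a Pearson correlation with a rank-based (Spearman) correlation on a small sample. This is a hedged sketch in Python, not the analysis performed in the project. The numbers in `recovery_days` and `survival_days` are made up for the illustration, and the function names are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up toy data: five ordinary patients plus one extreme outlier (last entry).
recovery_days = [10, 12, 14, 15, 16, 60]
survival_days = [300, 120, 400, 90, 350, 2000]

r_pearson = pearson(recovery_days, survival_days)
r_spearman = spearman(recovery_days, survival_days)
print(round(r_pearson, 2), round(r_spearman, 2))
```

In this toy sample the single outlier pushes the Pearson coefficient close to 1, while the rank-based coefficient stays moderate. With few data points, one extreme patient can create an apparent connection in the same way.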
Slide 10
Include:
- Discussion
- Conclusion
- Perspectives?